@makkruo makkruo commented Nov 19, 2025

feat(text-splitters): add MySQL language support to RecursiveCharacterTextSplitter

Description
This PR adds native MySQL language support to the RecursiveCharacterTextSplitter in the langchain_text_splitters package.

Previously, the text splitters did not include MySQL. As a result, MySQL-specific syntax such as DELIMITER directives, stored procedures (CREATE PROCEDURE … BEGIN … END), and multi-statement blocks could not be handled.
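To illustrate the kind of problem described above, here is a minimal sketch of what goes wrong when a stored procedure is split on `;` alone. It uses only the Python standard library and does not reproduce the behavior of any existing LangChain splitter; the procedure and table names are made up for the example.

```python
# A small MySQL script that redefines the statement delimiter so the
# procedure body can contain semicolons (names are hypothetical).
procedure = (
    "DELIMITER //\n"
    "CREATE PROCEDURE reset_counts()\n"
    "BEGIN\n"
    "  UPDATE stats SET hits = 0;\n"
    "  UPDATE stats SET misses = 0;\n"
    "END //\n"
    "DELIMITER ;\n"
)

# Naive statement-level split: treat every ';' as a statement boundary.
naive = [part.strip() for part in procedure.split(";") if part.strip()]

for index, part in enumerate(naive):
    print(f"--- fragment {index} ---")
    print(part)
```

The first fragment contains BEGIN without its matching END, so the procedure body ends up spread across several fragments. A DELIMITER-aware splitter is meant to avoid this.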

This contribution introduces:

  • A new Language.MYSQL enum value
  • A MySQL-specific splitter configuration (separators and heuristics)
  • Integration into from_language() so users can easily load the MySQL splitter
  • Unit tests validating MySQL-aware splitting behavior

These changes provide first-class MySQL support and expand the usability of LangChain for database-related applications.
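The separator-based approach listed above can be sketched roughly as follows. This is a standard-library illustration, not the code in this PR: the separator list `MYSQL_SEPARATORS` and the helper `split_recursive` are hypothetical, the PR's actual separators are not shown in this conversation, and the sketch omits the step in which the real splitter merges small pieces back together.

```python
import re

# Hypothetical separators, highest priority first (an assumption for
# illustration; not the list used by the PR).
MYSQL_SEPARATORS = [
    "\nDELIMITER ",
    "\nCREATE PROCEDURE ",
    "\nCREATE TABLE ",
    "\nSELECT ",
    "\nINSERT ",
    "\n\n",
    "\n",
    " ",
]


def split_recursive(text, separators, chunk_size):
    """Split on the first separator present, recursing into oversized pieces."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    for i, sep in enumerate(separators):
        if sep in text:
            break
    else:
        # No separator left to try: emit the oversized text as-is.
        return [text]
    # A lookahead split keeps each separator at the start of the next piece.
    pieces = re.split(f"(?={re.escape(sep)})", text)
    chunks = []
    for piece in pieces:
        chunks.extend(split_recursive(piece, separators[i + 1:], chunk_size))
    return chunks


sample = (
    "DELIMITER //\n"
    "CREATE PROCEDURE count_users()\n"
    "BEGIN\n"
    "  SELECT COUNT(*) FROM users;\n"
    "END //\n"
    "DELIMITER ;\n"
    "SELECT 1;"
)
chunks = split_recursive(sample, MYSQL_SEPARATORS, chunk_size=80)
for chunk in chunks:
    print(repr(chunk))
```

With this sample and chunk size, the whole CREATE PROCEDURE … END block stays in a single chunk because the DELIMITER and CREATE PROCEDURE boundaries are tried before plain newlines. With the change in this PR, users would presumably reach the real configuration through from_language() with the new Language.MYSQL value instead.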

Issue
Fixes #34058

Dependencies

  • No new dependencies required.
  • No changes to pyproject.toml or uv.lock.

Lint and Test Status
All checks were run successfully from the text-splitters package root:

  • make format — passed
  • make lint — passed
  • make test — passed
  • make integration_tests — passed

All added tests pass locally, and existing skipped tests are unrelated to this change.

@github-actions github-actions bot added the feature and text-splitters labels and removed the feature label Nov 19, 2025
@makkruo makkruo changed the title from "feat: add MySQL code text splitter with unit tests." to "feat(text-splitters): add MySQL language support to RecursiveCharacterTextSplitter" Nov 20, 2025
@makkruo makkruo force-pushed the feature/mysql-text-splitter branch from 35e4ba1 to 57331c2 November 20, 2025 15:12

makkruo commented Nov 20, 2025

@eyurtsev
Dear maintainers of LangChain,

I'm a new contributor to the LangChain open-source project, and I recently contributed some code for the MySQL language splitter rules. As a newcomer, I'm not entirely familiar with LangChain's PR process. Following the official contribution guide, I ran make test, make lint, and make integration_tests locally and submitted a PR from my own branch.

On the PR webpage, the CI runs were automatically triggered and all passed successfully. However, a pink exclamation mark appeared with the message "Merging is blocked. Code scanning is waiting for results from CodeQL for the commits..." I checked the CodeQL section under Actions and noticed that it didn’t run for my PR. I’m unsure whether this is normal or if CodeQL is required to run.

Should I wait for your review, or is there something I need to do next?

If you see this comment, I’d really appreciate your help. I’m eager to continue contributing to the LangChain community. Looking forward to your reply. Thank you!

@makkruo makkruo closed this Nov 21, 2025
@makkruo makkruo reopened this Nov 21, 2025
@makkruo makkruo force-pushed the feature/mysql-text-splitter branch from 7ce1707 to a559b13 November 21, 2025 06:25

Development

Successfully merging this pull request may close these issues.

Add MySQL language text splitter support